Cory Whitney
2019-03-13
Open RStudio
Help > Cheatsheets > Data Visualization with ggplot2
Type ‘?’ followed by a function, package or data name in the R console
Add “R” to a web search that includes a copy of the error message
Many talented programmers scan the web and answer questions
R has several systems for making graphs
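# stringsAsFactors = TRUE reads text columns as factors (not the default since R 4.0.0)
participants_data <- read.csv("participants_data.csv", stringsAsFactors = TRUE)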
plot(participants_data$academic_parents)
Bar plot of number of observations of binary data related to academic parents
plot(participants_data$academic_parents, participants_data$days_to_email_response)
Boxplot of days to email response grouped by binary data related to academic parents
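The two calls above give different graphs because base plot() chooses the graph type from the class of its arguments. A minimal sketch of that behaviour, assuming academic_parents is stored as a factor; the `toy` data frame and its values are made up for illustration and are not the course data:

```r
# Made-up illustration data (NOT participants_data.csv)
toy <- data.frame(
  academic_parents = factor(c("Y", "N", "Y", "N", "Y", "N")),
  days_to_email_response = c(1, 3, 2, 5, 1, 4)
)

# A single factor: plot() draws a bar plot of counts per level
plot(toy$academic_parents)

# Factor on x, numeric on y: plot() draws one box per factor level
plot(toy$academic_parents, toy$days_to_email_response)

# A character column is not a factor, so convert it before plotting
as_text <- as.character(toy$academic_parents)
plot(factor(as_text))
```

If a column was read as text rather than as a factor, wrapping it in factor() as in the last call is one way to get the same bar plot.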
Use help '?' for function
?plot
R has several systems for making graphs, but ggplot2 is one of the most elegant and most versatile.
It implements the grammar of graphics to describe and build graphs.
You can do more, faster, by learning one system and applying it in many places.
Learn more about ggplot2 in “The Layered Grammar of Graphics”
library(ggplot2)
qplot(days_to_email_response, letters_in_first_name, data = participants_data)
Scatterplot of letters in your first name as a function of days to email response
Use help '?' for function
?qplot
Want to understand how all the pieces fit together? See the R for Data Science book: http://r4ds.had.co.nz/
Example from Fisher's iris data set
qplot(Sepal.Length, Petal.Length, data=iris, color=Species, size=Petal.Width)
Scatterplot of iris petal length as a function of sepal length with colors representing iris species and petal width as bubble sizes.
Use help '?' for data
?iris
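The same iris graph can also be written with ggplot() and an explicit layer, which shows the grammar of graphics more directly: data, aesthetic mappings, then a geom. This is a sketch of one equivalent form, not the only one. Note that qplot() was deprecated in ggplot2 3.4.0, so the layered form is the safer habit with newer versions. The object name `p` is an arbitrary choice.

```r
library(ggplot2)

# Data and x/y mappings go in ggplot(); the point layer adds color and size
p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point(aes(color = Species, size = Petal.Width))

# Printing the object draws the graph
print(p)
```

Storing the plot in an object makes it easy to add further layers later, for example `p + theme_minimal()`.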
Example from your data
qplot(days_to_email_response, letters_in_first_name, color=academic_parents, size=working_hours_per_day, data=participants_data)
Scatterplot of letters in your first name as a function of days to email response with colors representing binary data related to academic parents and working hours per day as bubble sizes.
Make more graphs
cor.test(participants_data$days_to_email_response, participants_data$letters_in_first_name)
Pearson's product-moment correlation
data: participants_data$days_to_email_response and participants_data$letters_in_first_name
t = -0.64191, df = 7, p-value = 0.5414
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.7780668 0.5078670
sample estimates:
cor
-0.2357798
Use help '?' for function
?cor.test
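To see where the numbers in the output above come from, the test statistic, p-value and confidence interval can be recomputed by hand from the reported correlation and degrees of freedom. This is a sketch using the standard formulas for Pearson's test (a t statistic with n - 2 degrees of freedom and a Fisher z interval); the variable names are arbitrary.

```r
r  <- -0.2357798   # sample correlation reported in the output
df <- 7            # degrees of freedom reported in the output
n  <- df + 2       # Pearson's test uses df = n - 2, so n = 9

# t statistic and two-sided p-value
t_stat  <- r * sqrt(df) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df)

# 95 percent confidence interval via Fisher's z transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

print(round(c(t = t_stat, p = p_value), 4))
print(round(ci, 4))
```

With only nine observations the interval is wide and includes 0, which is why the test gives no evidence of a correlation.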
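library(gganimate)
library(datasauRus)  # provides the datasaurus_dozen data
# Animate from one dataset to the next
ggplot(datasaurus_dozen, aes(x = x, y = y)) +
geom_point() +
theme_minimal() +
transition_states(dataset, transition_length = 3, state_length = 1) +
ease_aes('cubic-in-out')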
ggplot(mtcars, aes(factor(cyl), mpg)) +
geom_boxplot() +
geom_point() +
transition_states(am, transition_length = 4, state_length = 1) +
view_follow()
?pdf
?png
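Both pdf() and png() follow the same pattern: open a graphics device, draw the plot, then close the device with dev.off() so the file is written. A minimal sketch, assuming the files go to a temporary folder; the file names and sizes are arbitrary choices:

```r
# Save a plot as PDF (width and height in inches)
out_pdf <- file.path(tempdir(), "scatter.pdf")
pdf(out_pdf, width = 6, height = 4)
plot(iris$Sepal.Length, iris$Petal.Length)
dev.off()  # closing the device writes the file

# Save the same plot as PNG (width and height in pixels),
# only if this R installation supports the png device
out_png <- file.path(tempdir(), "scatter.png")
if (capabilities("png")) {
  png(out_png, width = 800, height = 600)
  plot(iris$Sepal.Length, iris$Petal.Length)
  dev.off()
}
```

When saving a ggplot2 graph inside a function or loop this way, wrap the plot object in print() between opening and closing the device, otherwise nothing is drawn.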
Install Git and join GitHub (if you have not already done so).
Git https://git-scm.com/downloads
GitHub http://r-pkgs.had.co.nz/git.html
Join GitHub https://github.com/